@jeremypvx

We do not need to emit 'end' here; it will be emitted automatically by the Node.js stream implementation. The problem is the following:

Let's assume we have a read stream over a ZIP file that we want to parse. We use the following construct:

const zip = unzip.Parse();
myStream.pipe(zip);

zip.on('data', data => {
  // do something with the data
});

zip.on('end', () => {
  // finish processing the data
});

What happens is that zip contains both a write stream and a read stream. myStream.pipe(zip) pipes myStream into zip's write stream, while zip's read stream is consumed by the 'data' event listener. The 'finish' event pertains to the write stream, while 'end' pertains to the read stream. The code therefore emits 'end' when the write stream finishes, under the assumption that finishing the write stream means the read stream should also finish: any data processed by the write stream before finishing would already have resulted in a 'data' event being emitted by the read stream.

However, the autodrain feature seems to interfere with this assumption. As described in the documentation: 'If you do not intend to consume an entry stream's raw data, call autodrain() to dispose of the entry's contents. Otherwise the stream will halt.' This can cause the following sequence of events:

  • The read stream is paused, because the current entry is not being consumed, or is taking a while to consume (for instance, because it is a large file).
  • The write stream emits 'finish', because it has finished parsing the ZIP file, which in turn causes the 'end' event to be emitted on the read stream.
  • The reader finishes consuming the current entry, possibly by calling autodrain(). It then stops, because it has received the 'end' event and does not expect any more data.
  • More 'data' events are emitted, because the read stream resumed sending them once the entry was consumed or autodrain() was called.

This effectively means the reader cannot rely on the 'end' event, as more data may arrive afterwards.
